We tackle the problem of tracking the human lower body as an initial step toward an automatic motion assessment system for clinical mobility evaluation, using a multimodal system that combines Inertial Measurement Unit (IMU) data, RGB images, and point cloud depth measurements. The system casts 3-D skeleton joint estimation as an optimization problem represented by a factor graph. In this paper, we focus on improving the temporal consistency of the estimated human trajectories to greatly extend the operating range of the depth sensor. More specifically, we introduce a new factor, based on Koopman theory, that embeds the nonlinear dynamics of several lower-limb movement activities. This factor performs a two-step process: first, a custom activity recognition module based on spatial-temporal graph convolutional networks recognizes the walking activity; then, a Koopman prediction of the subsequent skeleton pose is used as an a priori estimate to drive the optimization toward more consistent results. We tested the performance of this module on datasets composed of multiple clinical lower-limb mobility tests, and we show that our approach reduces outliers in the skeleton form by almost 1 m while preserving natural walking trajectories at depths beyond 10 m.
Translated by Google Translate
Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, little work has been done on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block, which is intended to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and significantly outperforms a DARTS network of similar size at layer counts as small as 2. Furthermore, with fewer layers, not only does it achieve higher accuracy with lower GMACs and parameter count, but GradCAM comparisons also show that our network detects distinctive features of target objects better than DARTS.
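Center-based clustering (e.g., $k$-means, $k$-medians) and clustering using linear subspaces are two of the most popular techniques for partitioning real-world data into smaller clusters. However, when the data consists of sensitive demographic groups, a significantly different clustering cost per point for different sensitive groups can lead to fairness-related harms (e.g., different quality of service). The goal of socially fair clustering is to minimize the maximum per-point clustering cost over all groups. In this work, we propose a unified framework to solve socially fair center-based clustering and linear subspace clustering, and we give practical, efficient approximation algorithms for these problems. We conduct extensive experiments showing that, on multiple benchmark datasets, our algorithms either closely match or outperform state-of-the-art baselines.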
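As a rough illustration of the socially fair objective (the maximum, over groups, of the average per-point clustering cost), the following sketch evaluates that cost for a fixed set of centers under a $k$-means-style squared-distance cost. The function names, the toy data, and the choice of squared Euclidean distance are assumptions made for this example; it only evaluates the objective and is not the approximation algorithm proposed in the paper.

```python
import numpy as np

def per_group_costs(points, groups, centers):
    """Average squared distance to the nearest center, computed per group."""
    # Pairwise squared distances, shape (n_points, n_centers).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.min(axis=1)  # cost of each point under its closest center
    return {g.item(): float(nearest[groups == g].mean()) for g in np.unique(groups)}

def socially_fair_cost(points, groups, centers):
    """Socially fair objective: the worst (largest) per-group average cost."""
    return max(per_group_costs(points, groups, centers).values())

# Toy data (hypothetical): two groups of two points each, two centers.
points = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [10.0, 4.0]])
groups = np.array([0, 0, 1, 1])
centers = np.array([[1.0, 0.0], [10.0, 1.0]])

costs = per_group_costs(points, groups, centers)
fair_cost = socially_fair_cost(points, groups, centers)
```

Here group 0 has an average cost of 1.0 and group 1 an average cost of 5.0, so the socially fair cost is 5.0; a fair clustering algorithm would look for centers that lower this maximum rather than the overall average.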
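Loss of upper-limb control and function is a persistent symptom in post-stroke patients and imposes hardship on their activities of daily living. Supernumerary robotic limbs (SRLs) were introduced as a solution that restores the lost degrees of freedom (DoFs) by adding an independent new limb. The actuation systems in SRLs can be categorized into rigid and soft actuators. Soft actuators have proven advantageous over their rigid counterparts in terms of intrinsic safety, cost, and energy efficiency. However, they suffer from low stiffness, which jeopardizes their accuracy. Variable stiffness actuators (VSAs) are a newly developed technology that has been shown to ensure both accuracy and safety. In this paper, we introduce a novel supernumerary robotic limb based on variable stiffness actuators. To the best of our knowledge, the proposed proof-of-concept SRL is the first to use variable stiffness actuators. The developed SRL would assist post-stroke patients in bimanual tasks, such as eating with a fork and knife. The modeling, design, and realization of the system are described. The accuracy of the proposed SRL was evaluated and verified on predefined trajectories. Safety was verified by using a momentum observer for collision detection, and several post-collision reaction strategies were evaluated through a soft tissue injury test. The assistance process was validated qualitatively through a standard user-satisfaction questionnaire.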
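Analyzing sports performance or preventing injuries requires capturing the ground reaction forces (GRFs) exerted by the human body during certain movements. Standard practice uses physical markers paired with force plates in a controlled environment, but this is marred by high costs, lengthy setup time, and variance across repeated experiments. We therefore propose GRF inference from video. Although recent work has used LSTMs to estimate GRFs from 2D viewpoints, their modeling and representation capacity can be limited. First, we propose using a transformer architecture to tackle the GRF-from-video task, and we are the first to do so. Then, we introduce a new loss that minimizes the error on high-impact peaks in the regressed curves. We also show that pre-training and multi-task learning on 2D-to-3D human pose estimation improve generalization to unseen motions, and that pre-training on this different task provides good initial weights when fine-tuning on smaller (rarer) GRF datasets. We evaluate on LAAS Parkour and a newly collected ForcePose dataset, and we show up to a 19% decrease in error compared to prior approaches.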
Recent work has shown the benefits of synthetic data for use in computer vision, with applications ranging from autonomous driving to face landmark detection and reconstruction. Synthetic data offers a number of benefits, ranging from privacy preservation and bias elimination to the quality and feasibility of annotation. Generating human-centered synthetic data is a particular challenge in terms of realism and domain gap, though recent work has shown that effective machine learning models can be trained using synthetic face data alone. We show that this can be extended to include the full body by building on the pipeline of Wood et al. to generate synthetic images of humans in their entirety, with ground-truth annotations for computer vision applications. In this report we describe how we construct a parametric model of the face and body, including articulated hands; our rendering pipeline to generate realistic images of humans based on this body model; an approach for training DNNs to regress a dense set of landmarks covering the entire body; and a method for fitting our body model to dense landmarks predicted from multiple views.
While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet been sufficiently examined. Employing electroencephalography signals and a band-limited white noise stimulus at 4.8 Hz (the prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains 0.84 and 0.88 accuracy and 0.87 and 0.93 AUC for the Theta and Gamma bands, respectively.
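One way to picture the three scenarios (sources, sinks, and total) is as row sums, column sums, and their combination over a channel-by-channel causality matrix. The sketch below assumes a hypothetical convention in which `gc[i, j]` holds the causality strength from channel `i` to channel `j`, and it assumes simple summation as the aggregation; the abstract does not state the exact aggregation the authors use, and the matrix values are made up for illustration.

```python
import numpy as np

def channel_scenarios(gc):
    """Summarize a directed causality matrix per channel.

    gc[i, j] is the (assumed) causality strength from channel i to channel j.
    Returns (source, sink, total) activity vectors, one entry per channel.
    """
    gc = np.asarray(gc, dtype=float).copy()
    np.fill_diagonal(gc, 0.0)      # ignore self-influence
    source = gc.sum(axis=1)        # outgoing influence of each channel
    sink = gc.sum(axis=0)          # incoming influence on each channel
    return source, sink, source + sink

# Hypothetical 3-channel causality matrix.
gc = np.array([[0.0, 0.5, 0.1],
               [0.2, 0.0, 0.4],
               [0.0, 0.3, 0.0]])
source, sink, total = channel_scenarios(gc)
```

In this toy matrix the second channel receives the most incoming influence, so it would stand out in the sink scenario even though its source activity is no larger than that of the first channel.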
We propose an ensemble approach to predict the labels in linear programming word problems. Entity identification and meaning representation are the two tasks to be solved in the NL4Opt competition. For the first task, we propose the ensembleCRF method to identify the named entities. In our analysis, we found that single models did not improve results on this task, so a set of prediction models predicts the entities, and the ensembleCRF method combines their outputs into a consensus result. For the second task, we present an ensemble text generator that produces the representation sentences. Because of overflow in the output, we divide the problem into multiple smaller tasks: a single model generates different representations depending on the prompt, and all of the generated text is combined into an ensemble that yields the mathematical meaning of a linear programming problem.
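To make the idea of a consensus over several entity predictions concrete, here is a minimal sketch that combines per-token labels from several models by majority vote. Majority voting, the function name, and the label strings are assumptions chosen for illustration; the abstract does not specify how ensembleCRF forms its consensus.

```python
from collections import Counter

def consensus_labels(predictions):
    """Combine label sequences from several models by per-token majority vote.

    predictions: list of label sequences, one per model, all of equal length.
    Ties go to the label that appears first among the models' votes.
    """
    consensus = []
    for token_labels in zip(*predictions):
        label, _count = Counter(token_labels).most_common(1)[0]
        consensus.append(label)
    return consensus

# Hypothetical predictions from three models for a three-token sentence.
model_a = ["B-VAR", "O", "B-LIMIT"]
model_b = ["B-VAR", "O", "O"]
model_c = ["O", "O", "B-LIMIT"]
combined = consensus_labels([model_a, model_b, model_c])
```

Each token receives the label that at least two of the three models agree on, so an error made by a single model is outvoted.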
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies the DR grade, localizes lesion areas, and provides visual explanations; and (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. In addition, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss-function constraints between lesion features and classification features, our approach remains robust to a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly supervised manner.
This paper deals with the problem of statistical and system heterogeneity in a cross-silo Federated Learning (FL) framework where there is a limited number of Consumer Internet of Things (CIoT) devices in a smart building. We propose a novel Graph Signal Processing (GSP)-inspired aggregation rule based on graph filtering, dubbed "G-Fedfilt". The proposed aggregator enables a structured flow of information based on the graph's topology. This behavior allows capturing the interconnection of CIoT devices and training domain-specific models. The embedded graph filter is equipped with a tunable parameter that enables a continuous trade-off between domain-agnostic and domain-specific FL. In the domain-agnostic case, it forces G-Fedfilt to act similarly to the conventional Federated Averaging (FedAvg) aggregation rule. The proposed G-Fedfilt also enables an intrinsic smooth clustering based on the graph connectivity, without the clusters being explicitly specified, which further boosts the personalization of the models in the framework. In addition, the proposed scheme uses communication-efficient time scheduling to alleviate the system heterogeneity. This is accomplished by adaptively adjusting the number of training data samples and the sparsity of the models' gradients to reduce communication desynchronization and latency. Simulation results show that the proposed G-Fedfilt achieves up to $3.99\%$ better classification accuracy than the conventional FedAvg with respect to model personalization on the statistically heterogeneous local datasets, while it yields up to $2.41\%$ higher accuracy than FedAvg when testing the generalization of the models.
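The trade-off controlled by a tunable filter parameter can be illustrated with a generic low-pass graph filter. The sketch below assumes the filter $(I + \mu L)^{-1}$, where $L$ is the combinatorial Laplacian of the device graph and $\mu$ is a hypothetical tuning parameter; this is a standard GSP smoothing filter chosen for illustration and is not necessarily the filter used in G-Fedfilt. With $\mu = 0$ every client keeps its own model, and on a connected graph a large $\mu$ pushes every client toward the plain average of all models, which corresponds to FedAvg with equal client weights.

```python
import numpy as np

def graph_filter_aggregate(client_weights, adjacency, mu):
    """Smooth client models over a device graph (illustrative filter, see lead-in).

    client_weights: array of shape (n_clients, n_params), one row per client.
    adjacency: symmetric (n_clients, n_clients) matrix of edge weights.
    mu: non-negative smoothing strength (hypothetical tuning parameter).
    Returns one filtered model per client, same shape as client_weights.
    """
    W = np.asarray(client_weights, dtype=float)
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A                 # combinatorial graph Laplacian
    # Solve (I + mu * L) X = W instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(len(A)) + mu * L, W)

# Three clients on a path graph (0 - 1 - 2), each with a one-parameter model.
W = np.array([[0.0], [3.0], [6.0]])
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

personalized = graph_filter_aggregate(W, A, mu=0.0)   # each client keeps its model
averaged = graph_filter_aggregate(W, A, mu=1e6)       # close to the global mean, 3.0
```

Intermediate values of `mu` blend each client's model mostly with those of its graph neighbors, which is the kind of smooth, connectivity-driven clustering the abstract describes.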